1 How a GetFeature Request Works

This page will describe how a getFeature() request works in Geoserver. It will also show you how to set up Eclipse and use the debugger to run through a getFeature() request yourself.

DataSet

A dataset is the data that will be read by the getFeature() request. It is what makes up our maps.

Here is an example of a PostGIS database (or a shapefile) with the following data in it:

tiger_ny=# select state_name,state_abbr,persons::int4,families::int4,houshold::int4,male::int4,
           female::int4, geometrytype(the_geom) as the_geom from states_postgis;
  state_name   | state_abbr | persons  | families | houshold |  male   | female  |   the_geom
---------------+------------+----------+----------+----------+---------+---------+--------------
 Iowa          | IA         |  2776755 |   740819 |  1064325 | 1344802 | 1431953 | MULTIPOLYGON ...
 Massachusetts | MA         |  6016425 |  1514746 |  2247110 | 2888745 | 3127680 | MULTIPOLYGON ...
 Nebraska      | NE         |  1578385 |   415427 |   602363 |  769439 |  808946 | MULTIPOLYGON ...
 New York      | NY         | 18235907 |  4548344 |  6746555 | 8739138 | 9496769 | MULTIPOLYGON ...
 Pennsylvania  | PA         | 11881643 |  3155989 |  4495966 | 5694265 | 6187378 | MULTIPOLYGON ...
 Connecticut   | CT         |  3287116 |   864493 |  1230479 | 1592873 | 1694243 | MULTIPOLYGON ...
 Rhode Island  | RI         |  1003464 |   258886 |   377977 |  481496 |  521968 | MULTIPOLYGON ...
 New Jersey    | NJ         |  7484736 |  1962314 |  2687478 | 3622220 | 3862516 | MULTIPOLYGON ...
 Indiana       | IN         |  5544159 |  1480351 |  2065355 | 2688281 | 2855878 | MULTIPOLYGON ...
 Nevada        | NV         |  1201833 |   307400 |   466297 |  611880 |  589953 | MULTIPOLYGON ...
...

NOTE: This discussion is pretty much the same for any datastore (e.g. Oracle, shapefiles, DB2, SDE, etc.).

Iowa          | IA         |  2776755 |   740819 |  1064325 | 1344802 | 1431953 | MULTIPOLYGON ...

A getFeature request is targeted at the data and brings back Features in the form of GML.

Request

A request can be sent to Geoserver as a GET or a POST; both are handled similarly.

The getFeature process keeps the distinction between a GET and a POST until it reaches the FeatureRequest object: org.vfny.geoserver.wfs.requests.FeatureRequest. From FeatureRequest onward, the code no longer forks and the request works from one spot, execute(). Read on for more details.

GET and POST

http://localhost:8080/geoserver/wfs?
request=getfeature&
service=wfs&
version=1.0.0&
typename=states&
filter=<ogc:Filter
xmlns:ogc="http://www.opengis.net/ogc" xmlns:gml="http://www.opengis.net/gml">
<ogc:BBOX>
<ogc:PropertyName>the_geom</ogc:PropertyName>
<gml:Box srsName="http://www.opengis.net/gml/srs/epsg.xml">
<gml:coordinates>-73.99312376470733,40.76203427979042 -73.9239210030026,40.80129519821393</gml:coordinates>
</gml:Box>
</ogc:BBOX>
</ogc:Filter>

http://localhost:8080/geoserver/wfs

<wfs:GetFeature service="WFS" version="1.0.0"
  outputFormat="GML2"
  xmlns:topp="http://www.openplans.org/topp"
  xmlns:wfs="http://www.opengis.net/wfs"
  xmlns:ogc="http://www.opengis.net/ogc"
  xmlns:gml="http://www.opengis.net/gml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.opengis.net/wfs
                      http://schemas.opengis.net/wfs/1.0.0/WFS-basic.xsd">
  <wfs:Query typeName="states">
    <ogc:Filter>
      <ogc:BBOX>
        <ogc:PropertyName>the_geom</ogc:PropertyName>
        <gml:Box srsName="http://www.opengis.net/gml/srs/epsg.xml#4326">
           <gml:coordinates>
               -73.99312376470733,40.76203427979042 -73.9239210030026,40.80129519821393
           </gml:coordinates>
        </gml:Box>
      </ogc:BBOX>
   </ogc:Filter>
  </wfs:Query>
</wfs:GetFeature>

Exploring the HTTP GET request URL

The server address points to where your Geoserver instance is running. In this example, it is the local machine on port 8080.

The request type is the command that you are sending to the server. In this case the URL is asking "get me some features". Other commands can be sent as well, such as GetCapabilities, Transaction, and LockFeature.

Service type tells the server what service mode you want. Here we want WFS. Another possible service is WMS.

The type name is the FeatureType that you are querying, also known as the data. In our example, it is a shapefile that contains US states.

The filter is a restriction on our query. In effect it says "restrict my query to only features in this bounding box". There are many filters you can use, but we will not explore them in this tutorial. See the sleep-inducing OGC Filter specification if you really want to learn more.

How Geoserver interprets the request

Entry Point

When the request comes in, the servlet container (e.g. Jetty or Tomcat) will send the request to the WfsDispatcher. This is the entry point for Geoserver to process the request.

You can set up where this entry point is by changing your web.xml file, located in %GEOSERVER_HOME%/server/geoserver/WEB-INF:

  <servlet>
    <servlet-name>WfsDispatcher</servlet-name>
    <servlet-class>org.vfny.geoserver.wfs.servlets.WfsDispatcher</servlet-class>
  </servlet>
...
 <servlet-mapping>
    <servlet-name>WfsDispatcher</servlet-name>
    <url-pattern>/wfs/*</url-pattern>
  </servlet-mapping>

This says that any request to "/wfs/*" will get routed to the org.vfny.geoserver.wfs.servlets.WfsDispatcher servlet. Since both our requests (GET and POST) are to "http://localhost:8080/geoserver/wfs", the servlet container (e.g. Jetty or Tomcat) will send the request to the WfsDispatcher.

WFS Dispatcher

There are two main methods in WfsDispatcher.java (located in org.vfny.geoserver.wfs.servlets):

public void doPost(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException

public void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException

If you send an HTTP POST request, doPost gets called. If you send an HTTP GET request, doGet gets called. Get it?

POST
Since one cannot read the POST portion of an HTTP request more than once, a copy of the request is written to disk. This is done in WfsDispatcher.doPost().

NOTE: This is somewhat inefficient, but it was originally done this way because feature insert requests can be very large, and holding such a request in memory would not scale. A better (and faster) solution would be a simple class that holds a small request in memory and writes a large one to disk.
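
The "small in memory, large on disk" idea from the note above could be sketched roughly as follows. This is not Geoserver code: the class name RequestBuffer, the threshold parameter, and the temp-file naming are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper (not a Geoserver class): keeps a request body in memory
// until it grows past a threshold, then moves it to a temporary file.
class RequestBuffer {
    private final int thresholdBytes;
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private File file; // non-null once the request has spilled to disk

    RequestBuffer(int thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    // Copies the whole request body into the buffer. Call once per buffer.
    void fill(InputStream in) {
        OutputStream fileOut = null;
        try {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                if (file == null && memory.size() + n > thresholdBytes) {
                    // Too big for memory: move what we have so far to disk.
                    file = File.createTempFile("wfs-request", ".xml");
                    file.deleteOnExit();
                    fileOut = new FileOutputStream(file);
                    memory.writeTo(fileOut);
                    memory = null;
                }
                if (file == null) {
                    memory.write(chunk, 0, n);
                } else {
                    fileOut.write(chunk, 0, n);
                }
            }
            if (fileOut != null) {
                fileOut.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean isOnDisk() {
        return file != null;
    }

    // Each call opens a fresh stream, so the body can be read more than once.
    private InputStream open() throws IOException {
        return file != null
                ? new FileInputStream(file)
                : new ByteArrayInputStream(memory.toByteArray());
    }

    // Reads the buffered body back from the start.
    byte[] readAll() {
        try (InputStream in = open()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a buffer like this, a typical GetFeature request would never touch the disk, while a very large insert would still be kept out of memory.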

DispatcherXMLReader will then use SAX to parse the XML. It looks at the first tag in the XML request (see DispatchHandler) and determines the request type from that. In our case (see above), it's "<wfs:GetFeature>", so we know this is a Dispatcher.GET_FEATURE_REQUEST request.
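
Detecting the request type from the first tag can be illustrated with the standard JDK SAX API. The sketch below is not the DispatchHandler source; the class name RootElementSniffer and the trick of aborting the parse with an exception are assumptions chosen to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: remembers the local name of the first element it
// sees and then stops the parse, since nothing else is needed for dispatch.
class RootElementSniffer extends DefaultHandler {
    private String rootName;

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        rootName = localName;
        // Abort parsing on purpose: the root element is all we wanted.
        throw new SAXException("root element seen, stop parsing");
    }

    // Returns e.g. "GetFeature" for a <wfs:GetFeature ...> request body.
    static String rootElementOf(String xml) {
        RootElementSniffer handler = new RootElementSniffer();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so localName drops the "wfs:" prefix
            factory.newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    handler);
        } catch (Exception e) {
            // Our own abort lands here too; it is only an error if no root was seen.
            if (handler.rootName == null) {
                throw new IllegalArgumentException("could not read root element", e);
            }
        }
        return handler.rootName;
    }
}
```

A dispatcher could then map the returned name ("GetFeature", "Transaction", and so on) to its request-type constant.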

GET
This case is very easy: it just looks in the request URL for "request=GetFeature". You can see that in WfsDispatcher.doGet() and DispatcherKvpReader.

NOTE: "Kvp" means "Key-Value Pair". For the clause "request=GetFeature", the Key is "request" and the value is "GetFeature".
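
As a rough illustration of key-value pair handling, the sketch below splits a query string by hand and checks the "request" key. It is not the DispatcherKvpReader source: the class KvpSketch and its methods are hypothetical, and treating keys and the request name as case-insensitive is an assumption based on the example URL using "request=getfeature". In a real servlet the container already parses these parameters for you.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical KVP helper, for illustration only.
class KvpSketch {
    // Splits "a=1&b=2" into a map. Keys are upper-cased so lookups do not
    // depend on how the client spelled them (an assumption of this sketch).
    static Map<String, String> parse(String queryString) {
        Map<String, String> kvp = new HashMap<>();
        for (String pair : queryString.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            // Split on the first '=' only; values such as filters may contain '='.
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            kvp.put(decode(key).toUpperCase(Locale.ROOT), decode(value));
        }
        return kvp;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // True when the "request" key asks for GetFeature, in any letter case.
    static boolean isGetFeature(String queryString) {
        return "GetFeature".equalsIgnoreCase(parse(queryString).get("REQUEST"));
    }
}
```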

Whether it was an HTTP GET or POST, the WfsDispatcher will create an appropriate servlet. Remember the 6 different request types: GetFeature, Transaction, LockFeature, GetFeatureWithLock, DescribeFeatureType, GetCapabilities? For 'GetFeature' it will create a Feature servlet: org.vfny.geoserver.wfs.servlets.Feature

The GET or POST request information is then passed to that servlet, along with a response object that the servlet will populate. The response will be described later in the response section.

The diagram below shows where the distinction between GET and POST ends:

Not every object or class in the diagram recognizes the difference (web.xml, for example, does not); the diagram is meant to show how long the GET and POST distinction lives.

Feature Servlet

The next stage of the getFeature request creates a FeatureRequest object and populates it with the query information.

Depending on whether a GET or a POST was used, different parsers are selected.

The two parsers are GetFeatureKvpReader, for HTTP GET, and GetFeatureXMLReader, for HTTP POST. These will then create the FeatureRequest object.

The FeatureRequest object will then head over to the feature type that was specified in the URL, in this example "states", and query the data.

Response

The Response is what is sent back to the user after their request has been processed. The format of this response, for a getFeature request, is GML.

Where it Starts

At the very beginning, in WfsDispatcher, an HttpServletResponse is passed into doGet() and doPost():

public void doPost(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException

public void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException

This response object is passed into the Feature servlet, so it can be populated once the servlet has hold of a FeatureReader.

Output Strategy Object

The output strategy object tells Geoserver how to proceed when returning the data. What does this mean? Here are some examples that are specified in the web.xml file to help explain it:

<context-param>
    <param-name>serviceStratagy</param-name>
    <!-- Meaning of the different values :
         BUFFER
         - stores the entire response in memory first, before sending it off to
           the user (may run out of memory)

         SPEED
         - outputs directly to the response (and cannot recover in the case of an
           error)

         FILE
         - outputs to the local filesystem first, before sending it off to the user
      -->
    <param-value>SPEED</param-value>
  </context-param>

Why is it called Strategy? Strategy is a design pattern defined in the Gang of Four Design Patterns book (ISBN 0201633612). Essentially, it allows the user to plug in their own method of performing a specific task. So what Geoserver does is read the web.xml file, see what strategy you want to use (SPEED in our example), and plug it into the output response process.

To define your own output strategy object, look in org.vfny.geoserver.servlets.AbstractService. Your class must implement org.vfny.geoserver.servlets.AbstractService.ServiceStrategy.
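
To make the pattern concrete, here is a minimal sketch of pluggable output strategies. The interface below is hypothetical and is not the real ServiceStrategy signature, which this page does not show; only the SPEED and BUFFER behaviours described in the web.xml comment are modelled.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical interface for illustration; the real ServiceStrategy may differ.
interface OutputStrategy {
    // Returns the stream the encoder should write to.
    OutputStream destination(OutputStream response);

    // Called after encoding succeeded; pushes any held-back bytes to the response.
    void flush();
}

// SPEED: write straight to the response. Fast, but nothing can be taken back.
class SpeedStrategy implements OutputStrategy {
    public OutputStream destination(OutputStream response) {
        return response;
    }

    public void flush() {
        // Nothing was held back, so there is nothing to do.
    }
}

// BUFFER: collect the whole response in memory and send it only on success.
class BufferStrategy implements OutputStrategy {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private OutputStream response;

    public OutputStream destination(OutputStream response) {
        this.response = response;
        return buffer;
    }

    public void flush() {
        try {
            buffer.writeTo(response);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Picks a strategy by the name configured in web.xml (lookup code is assumed).
class OutputStrategies {
    static OutputStrategy forName(String name) {
        switch (name) {
            case "SPEED":
                return new SpeedStrategy();
            case "BUFFER":
                return new BufferStrategy();
            default:
                throw new IllegalArgumentException("unknown strategy: " + name);
        }
    }
}
```

The calling code never needs to know which strategy is in use: it asks for a destination, encodes into it, and calls flush() when encoding finishes without error.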

Feature Streaming

It's very important to note that the Geoserver/Geotools design allows for Feature Streaming, meaning that Geoserver only ever has around one feature in memory at a time. This matters for large queries (otherwise you'd run out of memory) as well as for running multiple queries simultaneously.

When a DataStore accepts a Query, it doesn't actually return Features. Instead it returns a FeatureReader, which can be used to read the Features that the Query selects one at a time. The delegate (the GML2 producer in our example) reads a single feature, converts it to GML2, and sends the result off to the output Strategy object.
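
The read-one, encode-one, write-one loop can be sketched as follows. Everything here is a toy model: SimpleFeatureReader is a hypothetical stand-in for a FeatureReader, a feature is just a pair of strings, and the XML written is a simplified placeholder rather than real GML2.

```java
import java.io.PrintWriter;
import java.util.Iterator;

// Hypothetical stand-in for a FeatureReader: hands out one feature at a time.
interface SimpleFeatureReader {
    boolean hasNext();

    String[] next(); // toy feature: {state_name, state_abbr}
}

class StreamingSketch {
    // Encodes features one by one, so only the current feature is held in
    // memory no matter how many rows the query selects. Returns the count.
    static int encodeAll(SimpleFeatureReader reader, PrintWriter out) {
        int count = 0;
        out.println("<FeatureCollection>");
        while (reader.hasNext()) {
            String[] feature = reader.next();
            // Placeholder markup, not real GML2.
            out.println("  <state name=\"" + feature[0]
                    + "\" abbr=\"" + feature[1] + "\"/>");
            count++;
        }
        out.println("</FeatureCollection>");
        out.flush();
        return count;
    }

    // Adapts any iterator of rows (e.g. a database cursor) to the reader interface.
    static SimpleFeatureReader readerOver(final Iterator<String[]> rows) {
        return new SimpleFeatureReader() {
            public boolean hasNext() {
                return rows.hasNext();
            }

            public String[] next() {
                return rows.next();
            }
        };
    }
}
```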

GML Encoding

After the output strategy has been determined, the Feature servlet sends the output to a FeatureResponse object. This feature response object will then pass on the information to the GML2FeatureResponseDelegate object.

The GML encoding object takes care of the rest of the output, which is streamed through the output strategy.

The End

Notes

How Datastores process Query & Filter

Some DataStores (like the database-backed ones) can do most of the Filter processing in the database using the database's indexes. Other datastores can do "quick" processing of certain components of the Filter. For example, the "normal shapefile" datastore can quickly do bounding-box tests for features. The "indexed shapefile" datastore (that's a shapefile with a .qix spatial index file) can do spatial searching quickly.

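Basically, the Filter object is sent to the datastore, which looks at the Filter and breaks it into components: the parts the datastore can evaluate itself (possibly only approximately) and the parts that must be evaluated in Java code.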

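All this is handled transparently by the datastore, so the programmer just has to send a Query object off to the DataStore and does not have to worry about how it is processed. The features returned by the FeatureReader will only be ones that pass the Filter conditions.

For example, a database-backed datastore such as PostGIS might turn a Filter for features that intersect a polygon and have a population over 1000000 into SQL along these lines: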

SELECT ... FROM <table> WHERE
     the_geom && <bounding box of the search polygon> -- this is the spatial index operation
 AND intersects(the_geom, <the polygon> )      -- this is the full OGC spatial operation
 AND population > 1000000;

The same query to the "normal shapefile" datastore will be processed differently. The shapefile datastore can perform bounding-box vs bounding-box operations quickly because the shapefile stores the bounding box of each geometry.

The read processing is done in two stages: (a) shapefile optimized and (b) Java-code handled.

foreach row in the shapefile
     If the row's bounding box overlaps the <bounding box of the search polygon>
       THEN send this row to the next stage
       OTHERWISE this feature does not pass the Filter condition

Java code will then take the "approximate" solution that the datastore can quickly compute and fully evaluate the Filter.

The same query to the "indexed shapefile" datastore can be processed even more efficiently. Instead of having to read large portions of the shapefile to test EVERY row to see if the bounding box intersects the search bounding box, it can just read a portion of the spatial index. Java code will then take this "approximate" solution that the datastore can very quickly compute and fully evaluate the Filter.
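
The two-stage idea (a cheap bounding-box pass followed by a full evaluation in Java) can be sketched as below. This is a toy model and not datastore code: the class names are hypothetical, and circles stand in for real geometries because their bounding boxes are visibly larger than the shapes themselves.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "approximate first, exact second" filtering.
class TwoStageFilterSketch {
    // Axis-aligned bounding box.
    static final class Box {
        final double minX, minY, maxX, maxY;

        Box(double minX, double minY, double maxX, double maxY) {
            this.minX = minX;
            this.minY = minY;
            this.maxX = maxX;
            this.maxY = maxY;
        }

        boolean overlaps(Box o) {
            return minX <= o.maxX && o.minX <= maxX
                    && minY <= o.maxY && o.minY <= maxY;
        }
    }

    // Toy feature: a named circle.
    static final class Feature {
        final String name;
        final double cx, cy, r;

        Feature(String name, double cx, double cy, double r) {
            this.name = name;
            this.cx = cx;
            this.cy = cy;
            this.r = r;
        }

        Box bounds() {
            return new Box(cx - r, cy - r, cx + r, cy + r);
        }

        // Exact test: is the closest point of the box within the radius?
        boolean intersects(Box search) {
            double nx = Math.max(search.minX, Math.min(cx, search.maxX));
            double ny = Math.max(search.minY, Math.min(cy, search.maxY));
            double dx = cx - nx;
            double dy = cy - ny;
            return dx * dx + dy * dy <= r * r;
        }
    }

    static List<String> query(List<Feature> features, Box search) {
        List<String> hits = new ArrayList<>();
        for (Feature f : features) {
            if (!f.bounds().overlaps(search)) {
                continue; // stage (a): cheap bounding-box rejection
            }
            if (f.intersects(search)) {
                hits.add(f.name); // stage (b): full evaluation of the condition
            }
        }
        return hits;
    }
}
```

A feature whose bounding box overlaps the search box but whose geometry does not is exactly the case that stage (b) exists to reject.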
